An Extended Reservoir of Class-D Beta-Lactamases in Non-Clinical Bacterial Strains

ABSTRACT Bacterial genes coding for antibiotic resistance represent a major issue in the fight against bacterial pathogens. Among those, genes encoding beta-lactamases target penicillin and related compounds such as carbapenems, which are critical for human health. Beta-lactamases are classified into classes A, B, C, and D, based on their amino acid sequence. Class D enzymes are also known as OXA beta-lactamases, due to the ability of the first enzymes described in this class to hydrolyze oxacillin. While hundreds of class D beta-lactamases with different activity profiles have been isolated from clinical strains, their nomenclature remains very uninformative. In this work, we have carried out a comprehensive survey of a reference database of 80,490 genomes and identified 24,916 OXA-domain containing proteins. These were deduplicated and their representative sequences clustered into 45 non-singleton groups derived from a phylogenetic tree of 1,413 OXA-domain sequences, including five clusters that include the C-terminal domain of the BlaR membrane receptors. Interestingly, 801 known class D beta-lactamases fell into only 18 clusters. To probe the unknown diversity of the class, we selected 10 protein sequences in 10 uncharacterized clusters and studied the activity profile of the corresponding enzymes. A beta-lactamase activity could be detected for seven of them. Three enzymes (OXA-1089, OXA-1090 and OXA-1091) were active against oxacillin and two against imipenem. These results indicate that, as already reported, environmental bacteria constitute a large reservoir of resistance genes that can be transferred to clinical strains, whether through plasmid exchange or hitchhiking with the help of transposase genes.

IMPORTANCE The transmission of genes coding for resistance factors from environmental to nosocomial strains is a major component in the development of bacterial resistance toward antibiotics. Our survey of class D beta-lactamase genes in genomic databases highlighted the high sequence diversity of the enzymes that are able to recognize and/or hydrolyze beta-lactam antibiotics. Among those, we could also identify new beta-lactamases that are able to hydrolyze carbapenems, one of the last resort antibiotic families used in human antimicrobial chemotherapy. Therefore, it can be expected that the use of this antibiotic family will fuel the emergence of new beta-lactamases into clinically relevant strains.



Introduction
Beta-lactamases are the main enzymes responsible for the resistance of bacteria to beta-lactams, which are the major antibiotics used in the fight against pathogenic bacteria. Even before the structure of penicillin was known, Abraham and Chain (1) described "An enzyme from bacteria able to destroy penicillin" and, in the late 1940s and early 1950s, the staphylococcal beta-lactamase became an important source of clinical problems, which were solved by the introduction of methicillin (2,3). Since then, an ever-increasing number of these hydrolases has been identified. They can be classified into four classes based on their primary structures. Classes A, C and D are active-serine enzymes (4), while class B consists of metallo-proteins whose active site usually contains one or two Zn²⁺ ions (5,6).
Beta-lactamases of classes A and D exhibit a very high diversity of amino acid (AA) sequences, with only a very small number of conserved residues within each class (e.g., 29 residues are conserved within class-A beta-lactamases) (7). It is nearly impossible to establish clear relationships between AA sequences and the ability to hydrolyze specific classes of beta-lactam antibiotics. Indeed, it is well known that a single mutation can significantly alter this activity profile (8,9). Moreover, the literature contains numerous disagreements and errors concerning the kinetic parameters of various enzymes (10). This is probably in part because these parameters are often determined under different experimental conditions and the studied enzymes are not always pure. Consequently, even though clinicians are more interested in specificity profiles, the AA sequences remain the primary tool for proposing a classification of beta-lactamases, as in the case of the Beta-Lactamase Database (BLDB; http://www.bldb.eu/) (11). Concerning class D beta-lactamases, the situation is complicated by the fact that these enzymes can dimerize, which sometimes modifies the activity (12), and that carboxylation of the lysine of the first conserved motif also increases the activity in most cases (13). Conversely, loss of CO2 during substrate turnover results in "substrate-induced inactivation", a phenomenon already observed by Ledent et al. (14).
The first two identified class D beta-lactamases exhibited a number of features that differed from those of nearly all beta-lactamases known at the time, notably the ability to efficiently hydrolyze oxacillin and other isoxazolyl penicillins. For this reason, they were named OXA-1 and OXA-2. Unfortunately, it was then decided to name further class D beta-lactamase homologs "OXA" followed by a sequential (i.e., increasing) number reflecting the chronological order of identification (15). This was sometimes done in spite of a sequence identity below 30% (10) and/or a greater similarity to the BlaR receptor than to other class D beta-lactamases (16). Class D beta-lactamases were first identified as plasmid-encoded proteins, but the corresponding genes were later found to reside on the bacterial chromosome too (10).
Similarity searches using the OXA-2 AA sequence as query revealed homologous primary structures of unknown function or without true beta-lactamase activity, such as YbxI/BSD-1 in Bacillus subtilis (17,18), or even devoid of any beta-lactamase activity, such as the C-terminal domain (CTD) of the BlaR penicillin-receptor involved in the induction of a class A beta-lactamase in Bacillus licheniformis and Staphylococcus aureus (19). In the present study, proteins containing a class-D beta-lactamase domain will be referred to as the "OXA-domain family". Among those, "DBL" will be reserved for demonstrably active class-D beta-lactamases, while characterized class-D beta-lactamase homologs of low activity or with a different function will be termed "pseudo-DBL" proteins. Finally, "DBL-homolog" proteins will designate the union of DBL, pseudo-DBL proteins, and other homologs not yet characterized.
It is clear that our present knowledge of the OXA-domain family is biased toward clinically relevant DBLs. The analysis of whole genome sequences of isolated bacteria and metagenome-assembled genomes highlighted that non-pathogenic and environmental bacteria can also harbor beta-lactamase-encoding genes, and thus may behave as reservoirs of the new resistance genes emerging in nosocomial strains (20,21). It is likely that these bacteria, which were never exposed to synthetic or semi-synthetic beta-lactams used in human health care or animal husbandry, can encounter other beta-lactam-producing microorganisms in their natural environment and, over the ages, have acquired beta-lactamase genes in their "struggle for life" (22). A significant example could be the carbapenems that can be produced by some Streptomyces species (23), probably resulting in the appearance of the carbapenemases that were later transferred to clinical strains (24,25). The large heterogeneity of the resistance gene repertoire present in bacteria challenges the efficiency of chemotherapy. It also underlines the need to develop new analytical methods allowing a clear and rapid identification of potential new resistance pathways, including enzymes that can inactivate both old and new antibiotics.
The goals of the present article were to explore genomic databases to discover how widespread the class D beta-lactamase gene and its homologs are throughout the microbial world and to propose a sequence-based classification of the members of the OXA-domain family derived from their phylogenetic relationships. Starting from 80,490 genomes, we identified a total of 24,916 DBLs and DBL-homolog sequences, which we classified into 64 clusters of proteins. Furthermore, we synthesized and expressed ten gene sequences sampled from ten clusters devoid of characterized members and conducted a survey of their activity. This revealed that three of them had an oxacillinase activity, including two able to hydrolyze imipenem, reminding us that environmental bacteria represent an enormous reservoir of resistance factors that can be transferred to clinical strains.

SQL database
Bioinformatic data generated in this study were stored in an SQLite3 database (Figure S1). This database was queried using SQL in order to generate additional results and statistics.
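As an illustration of how such a database can be queried, the following minimal Python sketch builds a small in-memory SQLite database and computes per-cluster statistics. The table name `protein`, its columns and the inserted rows are hypothetical placeholders and do not reproduce the schema shown in Figure S1.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'protein' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE protein ("
        "accession TEXT PRIMARY KEY, cluster_id INTEGER, phylum TEXT)"
    )
    # Placeholder rows; accessions and cluster numbers are invented.
    rows = [
        ("seq_a", 1, "Proteobacteria"),
        ("seq_b", 1, "Proteobacteria"),
        ("seq_c", 1, "Firmicutes"),
        ("seq_d", 2, "Firmicutes"),
    ]
    conn.executemany("INSERT INTO protein VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def cluster_sizes(conn):
    """Return (cluster_id, sequence count) pairs, largest cluster first."""
    query = (
        "SELECT cluster_id, COUNT(*) AS n FROM protein "
        "GROUP BY cluster_id ORDER BY n DESC, cluster_id ASC"
    )
    return conn.execute(query).fetchall()


def phyla_per_cluster(conn):
    """Return (cluster_id, number of distinct phyla) pairs."""
    query = (
        "SELECT cluster_id, COUNT(DISTINCT phylum) FROM protein "
        "GROUP BY cluster_id ORDER BY cluster_id"
    )
    return conn.execute(query).fetchall()
```

Queries of this kind (counts per cluster, distinct taxa per cluster) are one plausible way to derive summary statistics from the stored annotations.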

Domain characterization of OXA-domain family proteins
The potential presence of a signal peptide was predicted using a local installation of SignalP-5.0b (30). The organism option was set to 'gram+' for sequences belonging to Firmicutes and Actinobacteria and 'gram-' for the other phyla. To improve the prediction of

Localization and genetic environment of OXA-domain family proteins
A genetic environment database was built from the bacterial genomes featuring at least one OXA-domain family sequence using the GeneSpy "3 in 1" module, as described in the manual (34). Contig accessions were retrieved from the database and the corresponding FASTA files were downloaded using the command-line version of the "efetch" tool from the NCBI Entrez Programming Utilities (E-utilities).
PlasFlow v1.1 was used to predict potential plasmid sequences in the contig FASTA files (35).

Clinical strain determination
BioSample reports associated with bacterial organisms were also downloaded using efetch (see above). All words of a report were collected and fed to a script that renamed and standardized them using an OBO (Open Biomedical Ontologies) dictionary. A score was attributed to each standardized word: +1 for a "clinical" word, 0 for an uninformative word and -1 for a "non-clinical" word. Finally, a score was computed for each BioSample according to its collection of standardized words (see figshare). A bacterial strain was considered "clinical" when its metadata were associated with a positive score, "non-clinical" when the score was negative, and left unclassified when the score was zero.
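The scoring scheme described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the word list is an invented placeholder rather than the OBO-derived dictionary, and the final score is assumed to be the plain sum of the word scores, which the text does not specify.

```python
# Hypothetical standardized vocabulary: +1 clinical, -1 non-clinical.
# Words absent from the dictionary are treated as uninformative (0).
WORD_SCORES = {
    "hospital": 1,
    "patient": 1,
    "blood": 1,
    "soil": -1,
    "seawater": -1,
    "sediment": -1,
}


def biosample_score(words):
    """Sum the per-word scores (assumption: the final score is a simple sum)."""
    return sum(WORD_SCORES.get(word.lower(), 0) for word in words)


def classify_strain(words):
    """Map the score of a BioSample to one of the three categories."""
    score = biosample_score(words)
    if score > 0:
        return "clinical"
    if score < 0:
        return "non-clinical"
    return "not classified"
```

For example, `classify_strain(["patient", "blood", "culture"])` yields "clinical", whereas a report mixing one clinical and one non-clinical word sums to zero and stays unclassified.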

Alignment and phylogenetic analysis
After deduplication using CD-HIT v4.6 (26) with a global sequence identity threshold of 0.95, OXA-domain family protein sequences were aligned using MAFFT v7.273 (27). Alignments were then carefully optimized by hand using the program "ed" and alignment columns were manually selected using the program "net", both part of the MUST software package (36). The resulting matrix of 1413 sequences x 188 unambiguously aligned AAs was used to infer a phylogenetic tree with RAxML

Phylogenetic clustering
To produce clusters of related OXA-domain family sequences, the phylogenetic tree was first converted to a phylo4 object using the readNewick function of the phylobase R package (39). Then, a patristic distance matrix (dist.mat) was computed using the distTips function from the adephylo R package (40) and an adjacency matrix (adj.mat) was computed as follows:

In general, surveyed bacteria possess either a single DBL-homolog protein (9964 strains) or a single BlaR-homolog protein (5874 strains). In 1665 and 1813 strains, we found two DBL-homologs or two BlaR-homologs, respectively, and rarely more than two DBL-homologs (49) or BlaR-homologs (5). In addition, 963 strains simultaneously possess one DBL-homolog and one BlaR-homolog, while 10 strains show more than one DBL-homolog and one BlaR-homolog or the opposite (Fig. 1c). To assess the transfer potential of DBL-homolog and BlaR-homolog genes, and therefore the propensity for emergence of a new resistance, we looked for transposase genes in the vicinity of these genes (Fig. 2b). We noticed that DBL-homolog and BlaR-homolog genes are either close to transposase genes (distance of one to five genes) or very distant (more than 15 genes) in each genetic context. Balneolaeota are close to a transposase gene, which suggests that they might have been acquired by gene transfer. In contigs not classified by PlasFlow, we observed a higher prevalence of DBL-homolog genes than BlaR-homolog genes, and these DBL-homologs are very distant from transposase genes. As this pattern is similar to the one observed for chromosomes (Fig. 2b), it indicates that unclassified contigs likely correspond to chromosomes.
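The notion of distance to the nearest transposase gene, counted in genes, can be illustrated with the short Python sketch below. It assumes a contig represented as an ordered list of gene labels, which is a simplification and not the GeneSpy data model; the "intermediate" bin is a label introduced here for distances between the two ranges mentioned above.

```python
def distance_to_nearest_transposase(gene_labels, target_index):
    """Return the distance, in genes, from the target gene to the nearest
    gene labeled 'transposase', or None if the contig carries none."""
    positions = [
        i for i, label in enumerate(gene_labels)
        if label == "transposase" and i != target_index
    ]
    if not positions:
        return None
    return min(abs(i - target_index) for i in positions)


def proximity_bin(distance):
    """Bin a distance using the two ranges quoted in the text."""
    if distance is None:
        return "no transposase on contig"
    if 1 <= distance <= 5:
        return "close"
    if distance > 15:
        return "distant"
    return "intermediate"  # 6 to 15 genes; label introduced for this sketch
```

With the hypothetical gene order `["transposase", "hyp", "hyp", "oxa", "hyp"]` and the OXA-domain gene at index 3, the distance is three genes and the gene falls in the "close" bin.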

Signal peptide and transmembrane segment prediction
Most DBL-homolog sequences are characterized by a signal peptide (SP), as predicted by SignalP (

Prevalence of DBL-homolog genes in clinical strains
Acquired resistance in clinical bacterial strains is a very important concern, but determining the clinical origin of a given bacterial isolate only based on the metadata of the corresponding genome assembly is still challenging to automate at a large scale. Indeed, BioSample reports from the NCBI can contain such information but these remain difficult to analyze due to the lack of a controlled vocabulary. To overcome this difficulty, we used a script that standardizes all the words of a  (Table S1). In general, there is little taxonomic diversity within each cluster. Indeed, the majority of these clusters (50) contain sequences from organisms belonging to the same phylum or class.
Annotating the unique sequences using BLDB reference sequences at an identity

Assessment of the beta-lactamase activity in uncharacterized clusters
To test the beta-lactamase activity of some of the 46 non-annotated clusters, ten DBL-homolog sequences were selected for expression and production. Clusters were sorted from the largest to the smallest (considering all sequences and not only the representative ones), then one sequence was selected from each of the first ten clusters with no DBL found in the BLDB, requiring a sequence length between 250 and 350 AAs and no mutation in the three conserved motifs defining the class D active site. Thus, the ten DBL-homologs (termed OXAVL01 to OXAVL10) were selected from clusters 14, 22, 23, 28, 30, 39, 41, 42, 44 and 57 (Table S2). OXAVL01 has the two lysines of its active site mutated

The evaluation of the beta-lactamase activity on crude cell extracts (Table 3) showed that only OXAVL02 and OXAVL06 were able to hydrolyze all beta-lactams tested, including imipenem. OXAVL09 was active against nitrocefin, ampicillin and oxacillin but not imipenem. OXAVL03 was able to hydrolyze nitrocefin and ampicillin. Cell extracts of OXAVL04, OXAVL05 and OXAVL10 were active only against nitrocefin.
The DBL-homolog enzymes were not produced in an active form in the strains bearing the plasmid pOXAVL01, pOXAVL07 or pOXAVL08.
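The selection procedure used to pick candidate sequences can be sketched in Python as below. This is a hedged illustration, not the script used in the study: the `Candidate` record, the tie-breaking by cluster number, the choice of the first eligible sequence within a cluster, and the skipping of clusters without any eligible sequence are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    cluster_id: int
    length: int          # number of amino acids
    motifs_intact: bool  # True if the three conserved motifs are unmutated


def select_candidates(cluster_sizes, clusters_with_bldb_dbl, candidates,
                      n_clusters=10):
    """Pick one eligible sequence per cluster, largest clusters first."""
    # Rank clusters from largest to smallest (ties broken by cluster number).
    ranked = sorted(cluster_sizes, key=lambda c: (-cluster_sizes[c], c))
    selected = []
    for cluster_id in ranked:
        # Skip clusters that already contain a DBL listed in the BLDB.
        if cluster_id in clusters_with_bldb_dbl:
            continue
        eligible = [
            c for c in candidates
            if c.cluster_id == cluster_id
            and 250 <= c.length <= 350
            and c.motifs_intact
        ]
        if eligible:
            selected.append(eligible[0])  # assumption: take the first one
        if len(selected) == n_clusters:
            break
    return selected
```

In practice the choice within a cluster could follow any additional criterion (for instance, the representative sequence); the sketch only encodes the filters stated in the text.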

OXAVL02 and OXAVL06 have carbapenemase activity
Since crude extracts of OXAVL02 and OXAVL06 were the only ones able to hydrolyze all tested beta-lactams and had the highest level of expression in the soluble fraction, we focused our work on those two hydrolases. The purification of the two enzymes included three chromatographic steps, namely an anion exchanger, an IMAC affinity chromatography and a molecular sieve. For OXAVL02, the purification consisted of an IMAC column followed by a strong anion exchanger high

SEC experiments revealed that OXAVL02 elutes in three major peaks (Fig. 3a), with one at an elution volume typical of a monomeric DBL (~260 mL). The two additional peaks elute at about 230 mL and 180 mL, which is similar to the elution volumes of the dimer and multimer, respectively. The three peaks displayed an oxacillinase activity. Because SEC determines the oligomeric state of proteins with low precision, we further characterized these three peaks using SEC-MALS (Fig. 3b).

(Table 4). Indeed, in the absence of hydrogenocarbonate, their activity generally showed an initial burst, followed by a pronounced slowdown, even when the substrate conversion and product accumulation were quite low. Our data indicate that OXAVL02 displays a lower catalytic efficiency than OXAVL06.
We observed that neither enzyme was able to hydrolyze amoxicillin, temocillin, cefazolin or cefotaxime. In addition, OXAVL06 was not active against piperacillin or meropenem. We also confirmed that the two beta-lactamases displayed a carbapenemase activity. Imipenem was among the best substrates (kcat/Km = 0.016 and 2.5 µM⁻¹ s⁻¹ for OXAVL02 and OXAVL06, respectively). Compared with the values obtained for oxacillin, the kcat/Km ratios of OXAVL02 for meropenem and imipenem were 30- and 10-fold higher, respectively.

(Table S1). Most BlaR-homologs harbor a polar residue as the third residue of the second conserved motif (Table S1) (Table S1), which regroups so far only pseudo-DBL sequences like YbxI or BAC-1 (17,18).

Discussion
Probing clusters without a class D beta-lactamase representative
Beyond OXAVL01, we have selected nine DBL-homologs among the 45 clusters devoid of reference OXA-domain family proteins to probe their activity. Overall, for seven of the ten sequences selected for evaluation, a beta-lactamase activity was detected at least on crude extracts (Table 3), including two hydrolases active on imipenem (OXAVL02 and OXAVL06). The enzymatic studies of these two DBL-homolog enzymes confirmed that they both display a beta-lactamase activity and efficiently hydrolyze imipenem, but that meropenem is only inactivated by OXAVL02.
We also showed that the presence of hydrogenocarbonate enhances their catalytic activity, a sign that carboxylation of the first-motif lysine is necessary for optimal activity. As already shown for numerous other class D beta-lactamases, the monomeric form of OXAVL02 is in equilibrium with the dimeric form, the monomer being the predominant form of the enzyme at the concentration tested. These results, obtained with randomly selected enzymes, confirm that environmental strains provide a large reservoir of new resistance genes, with a high potential for resistance to carbapenems, a family of last-resort antibacterials.

Predicting activity profiles from amino acid sequences
The most clinically relevant result would be to deduce the activity profile of an enzyme from its AA sequence. However, determining the activity of only one representative DBL-homolog per cluster would not be informative of the specific activity profile of the cluster. In fact, it has been shown that a single mutated AA can alter the activity profile of a DBL (8,9). Although the sequence similarity between the 1413 representative sequences and their respective member sequences is high (i.e., at least 95% identity), the identity between the sequences within one of the 45 non-singleton phylogenetic clusters is low (i.e., down to 50%) (Table S3). Furthermore, this similarity is certainly underestimated because it is computed from only 188 unambiguously aligned AAs. Altogether, these arguments indicate that, for now, the activity profile of a DBL-homolog cannot be predicted based only on its AA sequence. This problem also applies to the other classes of beta-lactamases.
Solving this would require a major effort in the high-throughput biochemical characterization of the enzymes and the determination of their 3D structure, which is more likely to correlate with substrate specificity than the AA sequence. While the first part still represents a significant bottleneck, recent developments in 3D structure prediction with the AlphaFold software (50) have put the second part within reach, and the use of artificial intelligence to predict the activity profile of enzymes is not as far-fetched as it used to be.